Domain Architecture in Homolog Identification

Authors

  • Nan Song
  • R. D. Sedgewick
  • Dannie Durand
Abstract

Homology identification is the first step in many genomic studies. Current methods, which are based on sequence comparison, can produce a substantial number of mis-assignments because homologous domains align even when they occur in otherwise unrelated sequences. Here we propose methods to detect homologs through explicit comparison of domain architecture. We developed several schemes for scoring the similarity of a pair of protein sequences by exploiting an analogy between comparing proteins by their domain content and comparing documents by their word content. We evaluate the proposed methods using a benchmark of fifteen sequence families of known evolutionary history. The results demonstrate the effectiveness of comparing domain architectures with these similarity measures. They also demonstrate the importance both of weighting critical domains and of compensating for proteins with large numbers of domains.
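
The document analogy in the abstract can be illustrated with a standard information-retrieval measure. The sketch below treats each protein as a bag of domains, weights each domain by inverse document frequency so that widespread domains count less, and compares proteins by cosine similarity, which normalizes for the number of domains. This is an assumption made for illustration only: the abstract does not state which scoring schemes the authors used, and the protein names and architectures here are hypothetical toy data.

```python
import math
from collections import Counter

# Hypothetical toy data: protein name -> list of domains in its architecture.
ARCHITECTURES = {
    "kinaseA": ["Pkinase", "SH2", "SH3"],
    "kinaseB": ["Pkinase", "SH2"],
    "adaptor": ["SH3", "PH"],
    "other": ["SH3", "WD40"],
}


def idf_weights(architectures):
    """Weight each domain by log(N / number of proteins containing it).

    Domains found in many proteins (here SH3) get a low weight, so sharing
    them contributes little to the score.
    """
    n = len(architectures)
    doc_freq = Counter()
    for domains in architectures.values():
        doc_freq.update(set(domains))
    return {d: math.log(n / df) for d, df in doc_freq.items()}


def weighted_vector(domains, idf):
    """Turn a domain list into a sparse vector of count * weight."""
    counts = Counter(domains)
    return {d: c * idf.get(d, 0.0) for d, c in counts.items()}


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors.

    Dividing by the vector norms compensates for proteins that simply
    contain many domains.
    """
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def architecture_similarity(name_a, name_b, architectures=ARCHITECTURES):
    """Similarity of two proteins in the toy set by weighted domain content."""
    idf = idf_weights(architectures)
    u = weighted_vector(architectures[name_a], idf)
    v = weighted_vector(architectures[name_b], idf)
    return cosine_similarity(u, v)


if __name__ == "__main__":
    for pair in [("kinaseA", "kinaseB"), ("kinaseA", "adaptor")]:
        print(pair, round(architecture_similarity(*pair), 3))
```

In this toy set, the two kinases share two relatively rare domains and score much higher than the pair that shares only the widespread SH3 domain, which is the kind of distinction the abstract says sequence alignment alone can miss.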

Publication date: 2006